Rate constants for diffusive processes by partial path sampling 
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We introduce a path sampling method for the computation of rate constants for systems with 
a highly diffusive character. Based on the recently developed algorithm of transition interface 
sampling (TIS) this procedure increases the efficiency by sampling only parts of complete transition 
trajectories confined within a certain region. The algorithm assumes the loss of memory for highly 
diffusive progression along the reaction coordinate. We compare the new technique to the TIS 
method for a simple diatomic system and show that the computation time of the new method scales 
linearly, instead of quadraticaly, with the length of the diffusive barrier. The validity of the memory 
loss assumption is also discussed. 

PACS numbers: 82.20.Db, 82.20.Sb 



I. INTRODUCTION 

The calculation of rate constants in complex systems 
by straightforward molecular dynamics (MD) simulation 
is prohibited by the exponential dependence of the rate 
on the activation barrier height. The expectation time for 
a reaction can easily be on the order of milliseconds to 
seconds, whereas most simulation algorithms are limited 
to molecular time-steps of femtoseconds. A single occur- 
rence of a rare event in a complex system can thus easily 
exceed current computer capabilities by orders of mag- 
nitude. The standard Bennett-Chandler reactive flux 
method is able to avoid this timescale problem by cal- 
culating the probability to be at the top of the activation 
free energy barrier in combination with a time dependent 
transmission coefficient 1 ' 2 . Although in principle correct, 
the accuracy of this method is very sensitive to the choice 
of reaction coordinate. In a complex reaction, an intu- 
itive simple reaction coordinate can lead to extremely 
low transmission coefficients and, hence, an inaccurate 
or even immeasurable rate constant. Improving the re- 
action coordinate, by, for instanc e, in corporating solvent 
degrees of freedom (see e.g. Ref. |;li| ). is usually a very 
difficult task and requires a lot of a-priori knowledge of 
the system. 

The transition path sampling (TPS) technique of 
Chandler and co-workers£*§iZi2i£ does not need any prior 
knowledge of the reaction coordinate and harvests a col- 
lection of transition paths that connect the reactant with 
the product states. This ensemble of true dynamical 
paths allows detailed understanding of the kinetics and 
mechanism of the reaction. In addition, the rate constant 
can be computed. Processes as diverse as cluster isomer- 
ization, auto dissociation of water, ion pair dissociation, 
the folding of a polypeptide^ and reactions in aqueous 
solution have been studied with TPS (see Ref. [S| for an 
overview). 



As the TPS rate constant calculation is rather com- 
puter time consuming, we recently introduced the tran- 
sition interface sampling (TIS) techniqueii, thereby im- 
proving the efficiency of the rate constant calculation 
substantially. By allowing a variable path length, TIS 
drastically reduces the number of required time-steps for 
each path. In addition, TIS is less sensitive to recrossings 
and has a better convergence compare to the TPS rate 
constant method as it only counts the effective positive 
terms. 

In this paper, we will focus on transitions with a highly 
diffusive character, or in the regime of high solvent fric- 
tion. Examples are the folding and unfolding of a pro- 
tein in water, charge transfer, fragmentation reactions, 
diffusion of a molecule through a membrane, and nuclc- 
ation processes. These types of processes have to over- 
come a relatively fiat and wide, but still rough free en- 
ergy barrier. When applying the TPS (or TIS) shooting 
algorithm to such a transition, the Lyapunov instability 
causes the paths to diverge before the basins of attraction 
have the chance to guide the paths to the proper stable 
state. Pathways will then become very long and, more- 
over, the acceptance ratio of shooting will be low. Hence, 
the shooting algorithm will be very inefficient, resulting 
in bad sampling. Recently, we showed how to sample 
long paths efficiently on a diffuse barrier with TPS by 
introducing a little stochasticity in the trajectories^. 

Here, we will introduce an efficient method to calculate 
the rate constant for such barriers. To do so, we make 
use of the TIS effective flux relationii and assume that 
the diffusivity eliminates any memory effects over a dis- 
tance more than the separation between two interfaces. 
The rate constant can then be recast in a recursive rela- 
tion for the hopping transition rates between interfaces. 
These hopping transition rates can be computed by sam- 
pling short trajectories connecting just three successive 
interfaces. If the assumption of memory loss is valid, this 
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FIG. 1: Illustration of a barrier consisting of a series of 
metastable states. One possible trajectory connecting A and 
B is shown. 



partial path transition interface sampling (PPTIS) pro- 
cedure correctly collects the contributions of all possible 
paths to the rate constant, in principle, even those with 
infinite lengths. 

This paper is organized as follows: In Sec.lTTR we illus- 
trate the PPTIS concept for a simple one dimensional ar- 
ray of well defined metastable states. Although not com- 
pletely without physical importance, the model in this 
section can only describe a limited number of physical 
systems since most diffusive systems do not have well de- 
fined metastable regions in the free energy landscape be- 
tween reactant and product state. In Sec. lIIBl we present 
the TIS technique in a way that facilitates the derivation 
of the rate expression for general diffusive barriers given 
in Sec. Ill CI The implementation of the sampling algo- 
rithm and the analysis of the accuracy of the assumptions 
are discussed in Sec. Ill Dl and III El respectively. A com- 
parison between the TIS and PPTIS methods is made in 
Sec. Ill Fl In Sec. IIIII we test the method and compare 
with the TIS results for a isomerization reaction of a di- 
atomic molecule with an intrinsic long and flat barrier 
immersed in a fluid of repulsive particles. Wc end with 
concluding remarks and prospectives in Sec. IIVI 



II. THEORY 
A. Illustration of the PPTIS concept 

Before embarking on the general case of diffusive barri- 
ers, we will first consider a simple one dimensional system 
that serves as an illustrative example. This system ex- 
hibits a barrier consisting of a series of metastable states 
as is illustrated in Fig. The overall barrier is high 
compared to those between metastable states, so that the 
system shows two state kinetics and an overall rate con- 
stant k,AB is well defined. We assume that the system can 
hop from one metastable state to a neighboring one after 
which it will fully relax. Consequently, the probability to 
hop to left or right does not depend on the history of the 
path, and hence the system behaves Markovian. For this 
type of system, we might write down a master equation 
and solve for all the population densities in each state on 
the barrier as a function of timei^. However, if we assume 
steady state behavior, and take into account the fact that 



the population on the barrier is low, the overall rate con- 
stant is only determined by the hopping probabilities. 
We will denote the probabilities to transfer from site i 
to the right or left metastable state by i^i+i and t^j—i, 
respectively, which are related by t^j+i + 1 = I . For 
a system with n— 1 metastable states: Mi, M%, . . . M„_i 
and the stable states Mo = A and M n = B, the reaction 
rate kAB and its reverse ksA can be expressed as: 



kAB 
kBA 



feo,iT[l -J], 
fc n , n _iT[n - I 



nJ ? 



(i) 



with T[i — >^J the probability to reach via an arbitrary 
number of hops from metastable state i to metastable 
state j before visiting metastable state m. The computa- 
tion of the rate constants only requires the determination 
of the nearest neighbor hopping probabilities and 
the first hopping rates fcrj.i and k n . n ~\. The long distance 
hopping probabilities {T[l — > J ],T\j — I — can be ob- 
tained via following recursive relations (see Appendix lVj) : 
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Starting with T[l = T[0 ->•?] = 1, we can itera- 
tively solve Eqs. (J2J for j = 2, 3 . . . to j = n. In this way 
we collect analytically the statistics of all possible path- 
ways. This procedure accounts for the straight-forward 
barrier crossings, but also accounts for the contributions 
to the rate of an infinite number of different pathways 
that approach from A to B in an infinite number of hops. 
Although the probability of a single pathway is decreas- 
ing with its length, the total contribution of the very long 
pathways becomes more important when n is increased. 
In fact, the average path length increases as ~ ? 
case of uniform symmetric hopping (i^j+i = tj,i- 
for all i) one can show by induction that kAB = —ko,i, 
whereas if we would only account the fastest pathway 
(Mo — ► Mi — > M2 . . . — > M n ) it would be much lower, 

At first sight, it seems a bit surprising that the res- 
idence time in each metastable state and the absolute 
intra-barrier rates fcj,i±i have no influence on the final to- 
tal rate expression. Only the relative rates are important 
as they determine the nearest neighbor hopping probabil- 
ities by U,i±i = fa,i±x / (ki,i+i + h,i-l)- Of course, when 
we start with a system out of equilibrium and calculate 
the relaxation time from A to B for a system that is ini- 
tially completely in A, the intra-barrier rates fci,i±i will 
be dominant factors. 

Our treatment of this model can be related to the so- 
lution of Kramer's equation if one considers a flat high 
barrier of length I. Kramer's equation then gives for the 
rate constant kAB ~ (D / 1) exp(— (3U) where U is the 
barrier height and D the diffusion constant on top of the 



3 



barrier—. The connection becomes clear when one realizes 
fco i/fci,o = exp(— /3£7) and D/l = k/n, with k ~ fc 1; o the 
hopping rate, and n the number of hops on the barrier. 
Hence, kAB — — fco.i 5 j us t as found above for the symmet- 
ric uniform hopping model. A more formal treatment of 
general diffusive Markov processes can be found in e.g. 
Ref. [3. 

The model described above is of limited importance 
due to its highly symmetric and one-dimensional char- 
acter. Some processes, however, such as the diffusion 
of particles through a one-dimensional crystal (e.g alka- 
nes through zeolites) can be described by this uniform 
symmetric hopping model. More complex behavior such 
as diffusion on surfaces, through multi-dimensional crys- 
tals, or in (biological) networks usually has to be studied 
by means of Monte Carlo (MC) algorithms to solve the 
master equation, often called kinetic MC methodsi^i^i 6 ". 
Still, the example given here is illustrative for the more 
complex PPTIS method advocated in this paper. The 
PPTIS method combines the iterative solution of Eq. © 
for the overall rate constant with the TIS algorithm 11 . 
This approach will enable treatment of a much wider va- 
riety of systems with a diffusive character, but not with 
such a rigid structure as the one dimensional Markov 
chain. 



t\ and t[ defined in this way always have positive values. 
Following Ref. [Tl|. we then introduce two- fold charac- 
teristic functions that depend on two interfaces i ^ j. 



Ha*) = 



1 iit\{x) <t){x), 

otherwise 

'l iftf(x) <tj0), 

otherwise 



(4) 



which measure whether the backward (forward) time evo- 
lution of x will reach interface i before j or not. However, 
as the interfaces do not intersect, the time evolution has 
to be evaluated only for those phase points x that are 
in between the two interfaces i and j. In case i < j, 
we know in advance that tA{x) < £ (x) if X(x) < Xi 



m is 



and t°' J {x) > fj J (x) if X(x) > Xj. When the syste: 
ergodic, both interfaces i and j will be crossed in finite 

time and thus h\, (x) + h b j4 ( x ) = H,A X ) + H,i( x ) = L 
The two backward characteristic functions define the TIS 
overall states A and B: 



h A (x) = h b Jx), h B {x) = h b (x). 



(5) 



B. TIS formalism 

Let Xt denote a point in phase space defined by the 
position r and momenta p of all particles in the system 
at time t, xt = {r(t),p(t)}. Although the expressions 
derived in this paper are also valid for stochastic dy- 
namics, we assume here that the system is deterministic: 
Xt = f(xt>,t — t') = f(xa,t). Evaluation of the time prop- 
agator function f(x,t) requires the integration of motion 
(e.g. by means of MD) starting from configuration x over 
a time interval t. 

The TIS method is based on the measurement of the 
flux though dividing surfaces. For this purpose we define 
a set of n non-intersecting multidimensional interfaces 
{0, 1 . . . n} described by an order parameter X(x) which 
is a function of the phase space point x. In this way, 
interface i is the Af— 1 dimensional surface {x|A(a;) = Xi} 
for a system with AT degrees of freedom. We choose Xi , 
i = . . .n such that A^_i < Xi, and that the boundaries 
of state A and B are described by Ao and A„, respectively. 
To derive the TIS rate expression we need to introduce 
characteristic functions that do not only depend on the 
instantaneous position, but on the whole trajectory Xt- 
For each phase point x and each interface i, we define a 
backward time t\{x) and forward time t{(x): 



- max [{t\X(x t ) = Xi A t < 0}] 

+ mm[{t\X(x t ) = A z At > 0}] , (3) 



which mark the points of first crossing with interface i on 
a forward (backward) trajectory starting in xq. Note that 



Together, the overall states cover the entire phase space 
and, within certain limits, do not sensitively depend on 
the precise boundaries of stable states A and B as long 
as they are reasonable. That is, the stable states A and 
B should not overlap, each path from A to B should be a 
true reaction for the case of interest, and the chance that 
a trajectory starting in A and ending in B will return to 
A should be as unlikely as a complete new event. Within 
this formalism the rate constant can be written asii: 



kAB = 



hA(xo)h B (x ) 
{h^(x )) 



(6) 



where the dot denote the time derivative at t — and the 
brackets (. . .) denote the equilibrium ensemble averages. 
In the above equation, one can replace Xq by xt for any 
t and we will often skip the argument Xq when it does 
not lead to confusion. Eq. JBJ measures the effective flux 
through the phase space hyper-surface dividing A from 
B. To express this effective flux in terms that can be 
computed we define the general flux function 

<M^o) = h)Ax ) i™ -& 9 ( At - tf i( x o)) (7) 

with 0(x) the Heaviside step-function. In principle, the 
flux expressions are defined in the limit At —> where 
it converges to: ^ij = X\S(X(x) — Aj). In practice, 
however, Eq. J7J) will be more convenient with At equal 
to the MD time step. This function measures the velocity 
through interface i at t — while coming directly from j 
without having recrossed i. 
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With this notation we can write Eq. JgJ as 

k AB = {4>nfl) / (h A ) = UifiK o) I (M (8) 



for each < i < n, The ensemble average (</>i,o) is the 
effective positive flux from A through i. This means that 
we only count those phase points that will cross interface 
i in the positive direction in one At time step, and come 
directly from A, or equivalently, x should be a first cross- 
ing point of the corresponding trajectory that starts in 
A. 

For the reverse reaction rate, we can write similar ex- 
pressions, but then related to the effective negative flux: 



k 



BA 



f>o, n ) I (he) = (<A 



/(he)- (9) 



Note that this effective flux formalism has a lot of flex- 
ibility. The second equality in Eq. JSJ) and © is true 
for any interface Xi , independently on its position on the 
barrier or shape. If transition state theory (TST) is valid 
and we could choose our order parameter function X(x) 
such that {x\X(x) = Xi} is exactly the transition state 
dividing surface, all points on this surface directing to B 
would contribute to the rate. In that case, Eq. JHJ would 
become equivalent to the TST expressioniS. However, 
for complex systems it is almost impossible to determine 
the multidimensional dividing surface or it would require 
a lot of a-priori knowledge. Instead, the strategy of TIS is 
to relate the effective positive flux through one interface 
to that through one closer to A. 

By introducing the weighted ensemble average (g(x)) 
for an observable g(x) and a weight function uj{x): 



(g(x)u(x)) 
(u(x)) 



(10) 



we can rewrite the rate expression (JHJ) into a product 
of terms that can be measured as conditional ensemble 
averages. 



kAi 



(0i.o) 
(hA) 

(0i,o) 
(hA) 



a n,0 



n (^+i,o 



(ii) 



In the last equality we have used the TIS relation 
(Eq. (16) in Ref. [ill). The strength of the rate ex- 
pression (|llfl is that it rewrites Eq. © as a product of 
conditional probabilities each factor being much higher 
than the final rate. This recast expression allows a bet- 
ter accessible route for the computational approach, as 
it drastically reduces the number of necessary MC moves 
required for an accurate calculation of kAB- The actual 
algorithm consists of the computation of the effective flux 
from A through Ai, (0i,o) / {h A ), by means of standard 
MD, followed by the determination of the conditional 
probabilities (h{ +1 via a path sampling procedure. 



The advantage of TIS over the TPS rate constant algo- 
rithm is that the path length is variable so that each path 
can be limited to its strict necessary minimum length. 
Moreover, the effective positive flux formalism ensures 
that only positive terms are accounted in the Monte 
Carlo scheme, which improves the convergence. Still, for 
diffusive systems path lengths can become exceedingly 
long, making an accurate calculation problematic, even 
when using TIS. 



C. PPTIS formalism 

In previous section, we have reformulated the TIS the- 
ory in a way that facilitates the step toward PPTIS. It is 
important to note that the PPTIS formalism is also based 
on a relation between effective fluxes, however, not only 
in the positive, but also in the negative direction. The 
algorithm allows a more efficient evaluation of the for- 
ward rate kAB and, besides, also gives the reverse rate 
ksA with negligible extra costs. 

It is convenient to introduce a short notation for the 
effective flux function 



*fr(x) 



<j>ij(x)h\ r 



(12) 



In this notation, the ensemble average of <E>™q° is the effec- 
tive positive flux from A through going to B. Renor- 
malizing with (4>ij) defines the conditional probabilities 



(13) 



In words, this is the probability for the system to reach 
interface I before m under the condition that it crosses 
at t = interface i, while coming directly from interface 
j in the past (see Fig. |2J) . The rate constants can now be 
written in terms of these probabilities 



kAi 



7T-^nolo)> k BA = 



' ■ , - > -P(»|r l ), (14) 



(h B ) 



In addition, we define the one-interface crossing proba- 
bilities pf , pf , p= , and p\ . 



pT = PQ+l\li), P t=PQtl\i +1 ), (15) 

which fulfill the following relation: 

pf+pT=pf+p t i =1, (16) 

A schematic visualization of -P(mJ}) an d the probabili- 
ties (pf,Pi",pf,pl) is given in Fig. |2J We also define 
long-distance crossing probabilities and P~ , similar 



to those in Sec. Ill AI 



(17) 
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The main assumption in PPTIS is that trajectories 
lose their memory, over a short time, and hence over a 
short "distance" , as measured by A. We require that the 
interfaces are set such that no memory effects are present 
over more than the distance between two interfaces or, 
equivalently, that following relation is obeyed: 



(18) 



with q an integer larger than one and g(x) any observ- 
able corresponding to the actual state x or any future 
state. With this assumption we can derive recursive re- 
lations for the long-distance crossing probabilities using 
the PPTIS concept introduced in Sec. Ill Al fsee Appendix 
ED: 



PJ = 



P7 = 



T) ± P + 



P.i-1+Pi 



(19) 



To solve these recursive expressions we start with P^~ = 
Pi = 1, after which we iteratively determine (P + ,P~) 



m 




FIG. 

probabilities P( m |j) and (p~,pT,P 



2: Visualization of the different conditional crossing 
1 ' T p\). Left: Explanation 



for P( m |})- The condition on x, \)), is represented with 
the solid line: From x we must cross interface Xi in one At 
time step and, besides, when propagating backward in time it 
should not cross Xi another time before crossing Xj. A possi- 
ble continuation of the trajectory corresponding to a positive 
contribution to ( m | is given by the dashed line. The evalua- 
tion of f (ml}) requires the evaluation of all possible x fulfilling 
the condition and the measurement of the fraction for which 
the forward propagation crosses A; before A m . This proba- 
bility can be evaluated for any possible set of four interfaces 
{Xi, Xj, Xi, X m }. However, if we keep the interfaces Xi, Xj and 
A; on its position and place A m between Xi and A; we get the 
trivial case P( l m \j) = 0. Similarly, we get P( m |}) = 1 if we 
place A m at the right side of A;. Right: visualization of the 
one- interface crossing probabilities (pf ,p~ ,pf ,p\). Possible 
trajectories that correspond to a positive contribution of these 
probabilities are shown. 



for j = 2, . . . until j = n. Substitution of the long dis- 
tance crossing probabilities into Eq. I|14[l results in 



kAB 



i j 



kBA = 



(he) 



!.") p- 



(20) 



We obtain the reverse rate and the equilibrium con- 
stant C = kAB /ksA without any significant extra costs, 
whereas in other path sampling methods the calculation 
of the reverse rate would require another comparable 
computational effort. 



D. The Sampling 



pf,P 



The PPTIS method requires the determination of the 
^ and p\ probabilities. However, pf and p~ are 



defined in a different ensemble than pJ and pj . In most 
cases, it will be convenient to calculate the four proba- 
bilities simultaneously. Therefore, we define an ensem- 
ble that includes both ensembles via the weight function 

(f>i±(x): 



In this ensemble, pf and pf equal 



-T — 



-l,i+i 



i,i+l 



(21) 



(22) 



and p~ and p\ follow from Eq. (|16|) . 

For a correct sampling of phase points Xq in this ensem- 
ble, we generate all possible paths starting from interface 
i — 1 or i + 1 and ending either by crossing i — 1 or i + 1 
with at least one crossing with i. The sampling is per- 
formed using the shooting move as explained in Ref. 11 
with the difference that there is no need to reject the 
backward integration as it is allowed to reach either i — 1 
or i + 1. All paths are confined within A;_i and A.;+i and 
have, even in the case of multiple Xi crossings, only one 
time-slice x along the path for which (f>i± (x) ^ 0. This 
defines the phase point xq. A big advantage is that the 
time reversal moves become now very efficient, as they 
are cheap and will always result in a new phase point xq 
of the ensemble. 



E. Position of the Interfaces 

Contrary to the TIS technique, where the interfaces 
should be close to obtain good statistics, the interfaces 
should be sufficient apart in the PPTIS method to en- 
sure complete loss of memory. A simple test for Eq. I|18(l 
would be to measure (g(x)} ^. i for different separations 

between A^ and Aj_i. The velocity A at the crossing point 
through Xi would be a good candidate for the function 
This test is time consuming if it has to be applied for 



6 



all possible interface separations. However, one can es- 
timate the memory loss for interface separations smaller 
than the chosen one during the rate constant calcula- 
tion. If the interfaces are sufficient apart, one obtains a 
reasonable validation that the memory is vanished before 
reaching the next surface. Substituting A(x) into Eq. Ijl8(l 
gives 



X{x ] 



X(x )) 



(23) 



This relation can be rewritten in the ensemble of <j>i±: 



\{x F )h f i+li _ x {x ) 



4>i± 



h i+i,i-i( x o)) 



li+l.i— 1/ \ 



i{24) 



where xf = J(xq, min[t^_ 1 (xo), </ +1 (^o)] ) is the paths 
endpoint and X(xf) its velocity. A similar expression 
can be derived for the reverse direction. The endpoint 
velocity X(xp) is indicatory for the path's likelihood to 
progress along the order parameter A. Therefore, we can 
reasonably expect that if Eq. I|24|) is true for all interfaces 
Ai, the systematic error in the overall crossing probabil- 
ity due to the memory loss assumption will be small. 
Criterion l|24|l is obeyed if the endpoint velocities of the 
( — h)- and (++)-paths are the same, or if there are no 
(++)-paths present at all. The first case is true if the 
barrier is relatively flat and the interfaces are sufficiently 
far apart. The second case is typical for the system go- 
ing uphill. If the system is going downhill without having 
reached the basin of attraction of the product state, the 
memory loss requirement will demand a careful exam- 
ination on both the order parameter and the interface 
positions. 

An quantitative indication of the fulfillment of the 
memory loss criterion can be obtained by defining a 
memory- loss- function (MLF), for instance the ratio of 
the two terms at both sides of the equality in Eq. (|24|l . 
If we use a fine grid of n su b sub-interfaces between A^_i 
and Ai+i (See Fig.[3Jl, we can measure this function with 
a resolution of <5A = AX/n su b with AA = A^ — A^_i 
A, . — A 4 21 . The function MLFi(j8X) with j = 1 . . . n su b 
can be calculated in the <fii± ensemble during a PP- 
TIS simulation. To do this we will need an additional 
MC move when the path has multiple recrossings with 
interface i. On the path in Fig. [3] between A^_i and 
Ai+i only one phase point (1) belongs to the ensemble 
0Ai±AA = 4>i±- However, in the ensemble defined by the 
two most inner sub-interfaces 4>Xt±sx three points belong 
to the ensemble (1,2 and 3). Therefore, an additional MC 
move is required to measure MLF(j<5A) for jSX < AX. 
This works as follows: For each path in the <pi± ensem- 
ble, we loop over all sub-interfaces j. For each j, first 
collect all the phase points that belong to the ensem- 
ble of <pXi±jSX- Secondly, sample the MLF function for a 



random point out of the n points {x a 



(!) J 2 ) 
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FIG. 3: Illustration for the calculation of the MLF on the 
grid of sub-interfaces. One possible path is shown confined 
between Ai_i and Aj+i. 



which 4>Xi±j8x{xo) ^ 0. Third, take a uniform random 
number a between [0 : 1] and repeat the second step 
if a > 1/n. Otherwise, continue the loop over j until 
j = nsub. Finally, generate a new path, and repeat the 
whole procedure. 



F. Comparing TIS with PPTIS 

In order to make a proper efficiency comparison be- 
tween the two methods, we need to estimate the compu- 
tational effort for a certain fixed error. We rather calcu- 
late the error in the equilibrium constant C = kj^g / kg a 
instead of in the rate fc^B itself because the expres- 
sion of C in terms of the averages, that have to be 
calculated separately, is much simpler than the recur- 
sive expression l(T§|) of Hab- Hence, the error propaga- 
tion from the error in the individual terms is simpler 
and yields a more transparent comparison with TIS. As 
Pj~/Pj~ = Pjii/Pf-i ■ pf-i/pj-i, the equilibrium con- 
stant C can be written as: 



PPTIS 



[(M(h A )} 



[<0„-i,„>/(M] 



Pn-l 
Pn-1 



Pt 



Pf 



(25) 



Each term within brackets [. . .] is calculated separately 
together with its error. The error propagation of the total 
n + 1 terms determines the final overall error. Similarly, 
in TIS the expression for C can be written as: 



G 



TIS 



[V A (X n \X n - 1 )]---[V A {X 2 \X 1 )] 



[(<f>n-l,n)/(h B )} [7>b(A |Ai)] • • • [PB(A n _ 2 |A n _0 



■26) 



with V A (Xj\Xi) = P(i\h) and V B {Xj\Xi) = P&fc). 
Here, in total In simulations have to be performed, each 
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on a different ensemble. In practice, however, not all 
the interface ensembles are needed, as Pa(^i\K-i) an d 
^(AilAi+i) will converge to unity in the limit i — ► n 
and i — > 0, respectively. Interestingly, for PPTIS the er- 
ror in the terms [pf /pf] will be more or less the same 
for all i on the barrier. For TIS, however, the error in 
[Pa/b(M wm decrease when its value gets closer 

to unity. In contrast, the path length required for the 
calculation of these crossing probabilities will increase, 
while, again, being more or less constant in PPTIS. In 
Sec. IIII CI we will show that the final TIS computation 
time, that involves these two effects, scales quadratically 
as function of its barrier length and only linearly for PP- 
TIS. 



III. NUMERICAL RESULTS 

A. The model system 

In Refs. and [HI, the TPS and TIS methods were 
tested on a bistable diatomic molecule immersed in a 
fluid of purely repulsive particles. Here, we use a very 
similar system to test the PPTIS method, consisting of 
N two-dimensional particles interacting via the Weeks- 
Chandler- Andersen (WCA) potential^ 



V W CA{r) = 



4e[(r/o-)- 12 - (r/er)- 6 ] + e if r < r 
if r > r , 

where r is the interatomic distance, and ro = 2 1 ^ e a. In 
the following we will use reduced units so that the energy 
and length parameters e and a, the mass of the particles 
and the unit of time (mcr 2 /e) 1 / 2 are all equal to unity. In 
addition, two of the N particles are interacting through 
a double well potential 

Vdw(r) if r<r + w 

Vdij f(r) — <h if r + w < r < r + w + b , 

Vdw(r-b) iir>r + w + b 



where 



V dw {r) = h[l - (r - r - wf /w 2 



(28) 



(29) 



This potential and its first derivative are continuous and 
the forces are therefore well defined. It has two minima 
at r = ro, the compact state or state A, and at r = 
ro + 2w + b, the extended state or state B. The minima 
are separated by a total barrier of length b+2w and height 
h. For sufficiently large values of h, transitions between 
the states become rare and the rate constants are well 
defined. For sufficiently large values of b, trajectories 
on the barrier plateau become diffusive. We therefore 
expect this system to be a good test case for the new 
PPTIS method. 

We simulate the system at constant energy E = 1.0 
in a square box with periodic boundary conditions. The 



number density is fixed at 0.7, by adjusting the size of the 
box. The barrier length should always be less than half 
the box's edge, implying the number of particles N to 
increase accordingly with the value of the barrier length 
b. The remaining barrier parameters are set to h = 15 
and w = 0.5. The total linear momentum is conserved 
and is set to zero. The equations of motion are inte- 
grated using the velocity Verlet algorithm with a time 
step At — 0.002. The Monte Carlo path sampling is car- 
ried out both in PPTIS and TIS by means of the shoot- 
ing move and the path-reversal move, as explained in 
Sec. ED] and Ref. gj. The two moves were performed 
with an equal probability of 50%. The momentum dis- 
placement for the shooting move was always gaged such 
that the acceptance ratio is about 40%, which provides 
an optimum efficiency of the sampling^. The intermolec- 
ular distance r is a suitable order parameter to define the 
interfaces. 

In the following two subsections we consider a system 
with a barrier short enough to gather good statistics in a 
reasonable computation time. In section fill CI we study 
the gain in efficiency of PPTIS over TIS as a function of 
the diffusive barrier length. In the final section ITlI Dl we 
test the memory loss assumption as explained in Sec. Ill El 



B. System with short barrier 

We simulated a system of AT = 100 WCA particles 
with a barrier length b — 2. The minima of Vdiffir) 
are located at r ~ 1.12 and r ~ 4.12, and the diffusive 
plateau extends from r ~ 1.62 to r ~ 3.62. State A 
is defined by interface Ao as r < 1.22 and state B by 
interface A17 as r > 4.02. In the intermediate regime 16 
interfaces were chosen at r — 1.24, 1.34, 1.40,1.46, 1.52, 
1.62, 2.02, 2.42, 2.82, 3.22, 3.62, 3.72, 3.78, 3.84, 3.90, 
and 4.00. 

First, we ran straightforward MD simulations in state 
A and B to compute the fluxes that appear in both 
Eq. I|25|) and l|26|) by counting the number of positive 
crossings through interfaces Ai and Ai6, respectively^. 
We obtained the values (^i, )/(/u) = 0.1160 ± 0.0008 
and (4>is,n) / (his) — 0.117±0.001. Subsequently, we cal- 
culated the conditional probabilities in Eq. (|25l) . For PP- 
TIS we calculated the one-interface crossing probabilities 
for all the 16 interfaces on the barrier, while TIS simu- 
lations show convergence after 11 windows for both the 
forward and the backward reaction path. In Fig. 0] we re- 



k AB /W- 1U kBA/W 



-10 



C 



PPTIS 2.75±0.07 
TIS 2.8±0.2 



1.95±0.04 
2.03±0.06 



1.41±0.05 
1.4±0.1 



TABLE I: Comparison of PPTIS and TIS. Forward and back- 
ward rate constants as well as the equilibrium constant are 
reported for the system with short energy barrier. 
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Interface 

FIG. 4: Top: PPTIS one-interface crossing probabilities p , 
p T , see Eq. 1 lot . Thep = , probabilities follow directly from 
Eq. 1161 . Bottom: PPTIS long-distance crossing probabilities 
Pf , P~ , see Eq. 1171 . The last points contribute to the rate 
constants as in Eq. 12011 . In both graphs the error is within 
symbol size. 



port the one- interface crossing probabilities pf 1 , pf and 
the long-distance crossing probability P^~ , P~ . The long- 
distance crossing probabilities appearing in the rate con- 
stant Eq. (H for n = 17 are P+ = (2.37±0.06)10" 9 and 
P~ = (1.67 ± 0.03)10~ 9 . These values can be compared 
with their TIS counterparts V A {K\^i) = (2.4±0.2)10 -9 
and Pi3(A |A„_i) = (1.74 ± 0.05)10" 9 . We note that be- 
cause for the first 5 interfaces i — 1 . . . 5, pf equals unity, 
P.~~ is constant up to i = 6. Similarly, for i = 11 . . . 16, 
p i is unity and Pf~ shows a plateau starting at i = 12. 
This means that in the PPTIS methods, although for the 
equilibrium constant C all the windows are necessary, 
the separate computation of k^B and ksA requires fewer 
windows. The result is consistent with what we found in 
TIS. We report in table [I] the final rate and equilibrium 
constants. They all coincide within the statistical error. 

Another way to derive the equilibrium constant is by 
using the relation C = exp(AF / kT) where k is Boltz- 
mann's constant, T is the temperature, and AF — 
Fa ~ Fb is the free energy difference between states A 
and B. To check our results we calculated the free en- 
ergy as function of the intermolecular distance r using 
a straightforward MD simulation with a bias potential 



20 1 tt 




FIG. 5: The free energy as function of the dimer inter- 
atomic distance r. It was calculated from the dimer potential 
Vdiff{r), Eq. 12811 . corrected by a factor involving p(r), the 
probability of finding the dimer atoms at distance r, as re- 
ported in Eq. 13U1 . The function p(r) was computed using 
a biased MD simulation and is plotted in the inset. From 
the minima of F(r) we derived the free energy difference 
AF = Fa — Fb and hence the equilibrium constant. 

given by — Vdiff(r). This is equivalent to simulating a 
system of pure WCA particles and computing p(r), the 
probability of finding two particles at a distance r. The 
advantage is that in such a system it does not make any 
difference which two molecules we consider to compute 
p(r) and we can thus increase the statistics by averaging 
on all pairs. It can be easily seen that 

F(r) = V di ff(r) - kT\np(r) + constant (30) 

where in our microcanonical simulations the temperature 
T is derived from average kinetic energy. A plot of the 
free energy is shown in Fig. [SJ From this curve we derive 
the free energy difference AF between the two minima, 
and the equilibrium constant C — 1.369 ± 0.001. All re- 
sults are consistent with each other within the statistical 
error. 



C. Efficiency Scaling 

In both the PPTIS and the TIS method the final equi- 
librium constant is a product of factors given by Eqs. l(2*B|) 
and l|26|) . respectively. We determined each factor in- 
dependently by performing M simulation blocks of m 
Monte Carlo cycles. We adjusted m so that the relative 
standard deviation of each term in Eqs. (|25|l and (|26ll af- 
ter M block averages was an arbitrary value of 3%. We 
measured, under the same computational conditions (1.4 
GHz AMD Athlon), the CPU-time required and summed 
up all the times to get the relative efficiency. The final 
errors on the rate constants given above were obtained by 
standard propagation rules using all the available blocks 
of simulations. 
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FIG. 6: Comparison of path-lengths for PPTIS and the TIS 
simulations for the calculation of the forward and backward 
rate constant. Because of the diffusive character of the sys- 
tem, the TIS path-lengths keep growing as the interface moves 
further from the initial stable state. The PPTIS path-lengths 
on the contrary stay constant. The errors are within the sym- 
bol size. 



We computed the computation times to reach the pre- 
fixed 3% error for each factor in Eqs. i|25|) and i|26|l and 
found that for the simple dimer system the efficiency of 
PPTIS is at least a factor two higher than TIS. In figure 
|S]we plot the average path-length in each window for the 
two methods. The direct comparison shows that on the 
barrier PPTIS keeps the path length constant while the 
TIS path length increases. This is expected but it does 
not directly imply a gain in efficiency as the relative error 
in the TIS terms is smaller for the longer paths. 

In order to compare the effciency of both PPTIS and 
TIS quantitatively as a function of barrier length we con- 
sider the windows i — 1 . . . Nw- In each window we per- 
form simulations of m Monte Carlo cycles. Let the aver- 
age path length in each window be Li and relative error 
in the observable (e.g. hopping probability) be ti. If we 
assume that m is large enough so that all simulations 
are uncorrelated, then ej scales as the inverse square root 
of m. To obtain a fixed error e, one has to rescale the 
number of paths by (ei/e) 2 . Moreover, the acceptance 
ratio is almost independent of the path length for TIS in 
the kind of systems we have studied hereSS. As a result 
we found that the required CPU time for m MC cylces 
scales linearly with its average path length. The total 
computation time is then proportional to 



N w 

'S> 7 ' 



(31) 



If the barrier is very long we can neglect the initial and fi- 
nal windows on the steep side of the barrier and consider 
only those on the plateau for which we assume a fixed 
interface separation A A = b/Nw- The PPTIS method 
keeps Li and €i more or less constant (see Fig. [S] ) . The 



efficiency r] is the ratio of the TIS to the PPTIS compu- 
tation time 



V 



CPU 



TIS 



CPU 



E 



Nw t c 2 
1 ^i e i 



PPTIS 



Nu 



(32) 



where Li and are now the TIS path-length and relative 
error for window i. To study the behavior of Li and ej 
we focus on the TIS calculations for the forward reaction 
rate constant fc^s ■ The observables are the probabilities 
V A {X i+l \Xi) (seeEq. ®). 

To estimate the TIS effective computation time as 
function of the barrier length, we first examined the 
model of Sec. Ill Al with tj^ii = i (uniform symmet- 
ric hopping). This simplified system allows us to obtain 
some analytical results and to perform path sampling on 
wide barriers with millions of paths. We found that the 
relative error tj in the long distance hopping probabil- 
ities T[l — >q] scales as ~ ^j. Moreover, the average 
length of the corresponding path (the number of hops) 
scales quadratically with j, while the acceptance ratio 
remained constant. 

To test whether this scaling behavior also applies to 
the dimer model, we considered a system of N — 256 
particles with a barrier length 6 = 6. The minima of 
Vdiff(r) are located at r ~ 1.12 and r ~ 8.12. State A 
is defined by interface Ao as r < 1.20. We defined 22 
other interfaces, 16 of which on the barrier plateau from 
r ~ 1.62 to r ~ 7.62 at intervals AA = 0.4. By running 
several TIS simulations, we computed the crossing prob- 
abilities P^Ai+ilAi) and their standard deviations for 



0.1 



0.01 




FIG. 7: The relative variance for the TIS crossing probability 
7 ? A(Ai+i|Ai) plotted as function of the barrier length for a 
system with total barrier length 6 = 6. The relative variances 
have been rescaled to the one of the first interface measured 
from the start of the plateau. We fitted the relative variance 
with an inverse linear function. Inset: the the average path 
length for these simulations as function of the barrier length. 
The error bar is within symbol size. The solid line is a second- 
order polynomial fit. 
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i = 1 ... 21. In Fig. [7| we show the relative variance and 
the average path length for the windows on the barrier. 
Indeed, the scaling behavior is as expected. Evaluating 
the sum in Eq. I|32[l yields that the relative efficiency rj 
scales linearly with the barrier width. This means that 
while the computation time scales quadratically with the 
barrier width for TIS it scales linearly for PPTIS. 

D. Validity of the memory loss assumption 

We computed the memory loss function MLF(jSX) as 
defined in section III El for the same 6 = 2 system of sec- 
tion IIII 51 using the method described in Sec. Ill El with a 
central interface at r = 2.62 and <5A = 0.01 and j ranging 
from 1 to 100, corresponding to the entire length of the 
barrier plateau. Since not only the mean value of the 
endpoint velocity A but its complete probability distri- 
bution /(A) should be equal for paths of the ensemble 
(— +) and (++) we computed the overlap 




/±(A)/t(A) dX. (33) 



Similar expression was used for the paths (H — ) and 

( ). The results are reported in Fig. 00 It can be 

seen that for j5X > 0.2 the memory loss assumption is 
satisfied. 

In umbrella sampling methods, where the phase space 
is divided into partially overlapping windows which are 
later matched, the choice of the windows is a trade-off 
between the diffusion of the order parameter and its equi- 
libration time in the window 2 . Because of the first effect 




0.2 0.4 0.6 0.8 1 



FIG. 8: Memory loss function computed using the overlap of 
the distributions of the endpoint velocity A, see Sec. IIII Dl 
In the insets we plot the distributions for paths of the (+— ) 

ensemble (solid line) and the ( ) one (dashed line), for two 

different window sizes jSX = 0.01 and jSX = 0.2. The first 
two distributions are different, and the second ones are almost 
overlapping. 



the size of the window should be chosen as small as pos- 
sible, while the second puts a lower bound on it. Similar 
considerations apply to PPTIS with the addition of the 
memory loss requirement, which is also a lower bound to 
the window size. Taking all this into accounts we believe 
AA = 0.2 is an optimal choice. The optimal value of AA 
is of course dependent on the system and on the choice 
of order parameter, and has to be estimated by trial and 
error. 



IV. DISCUSSION AND CONCLUSIONS 

In this paper we have introduced a path sampling algo- 
rithm for the efficient calculation of rate constants of two 
state activated processes with a diffusive barrier. The 
method is based on the division of phase space by in- 
terfaces. We then calculate hopping probabilities from 
one interface to another, using transition path sampling 
shooting moves and time reversal moves as our basic in- 
strument to create new paths on each interface ensemble. 
Using either the iterative scheme given here or for more 
general hopping networks the method of kinetic Monte 
Carlo, one can solve the master equation and obtain 
the final forward and backward rate constants. In de- 
riving this algorithm we assumed complete memory loss 
between interface, such that the system becomes essen- 
tially Markovian, thus validating the use of kinetic Monte 
Carlo and similar algorithms. We showed that for a rela- 
tively simple system, the diatomic molecule, the memory 
loss assumption (loss of correlation) holds over the entire 
barrier. We expect that for more complex systems this 
memory loss requirement will certainly be fulfilled, pro- 
vided that the dynamics has a stochastic character and 
the interfaces are placed sufficiently far apart. However, 
the choice of order parameter requires still some caution, 
possibly more than in TIS, in order to satisfy the mem- 
ory loss requirement. For the simple dimer system, we 
showed that PPTIS is already twice as fast as TIS. More 
importantly, we argued that the computation time scales 
linearly with the barrier length, instead of quadratically 
as for TIS and maybe even with a higher power for TPS. 
This opens up possibilities for accurate rate constant cal- 
culations for complex activated processes. 

The method advocated here to tackle diffusive barriers 
in complex systems is not the first one that has been pro- 
posed in the literature. Several techniques have been put 
forward in the last decade, for instance the diffusive bar- 
rier algorithm by Ruiz-Montero et ali-& and the coarse 
MD method by Hummer and Kevrekidisi^. The latter 
technique uses short trajectories to calculate the average 
force projected on a order parameter space. They use 
that force to integrate a stochastic equation of motion 
and explore the free energy landscape in that way. Rate 
constants can then in principle be obtained from the dy- 
namics on this coarse grained surface. 

The method of Ruiz-Montero et al. is in essence a re- 
active flux method but enhances the statistics by measur- 
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ing the flux on many different places on the barrier and 
weigh those contributions such that the error in the rate 
constant is as small as possible. The weighing function 
turns out to be proportional to the inverse of the bar- 
rier free energy profile. This means that to get a mean- 
ingful result one should have access to the free energy 
landscape on the barrier, before the rate constant calcu- 
lation. However, due to complexity, the order parameters 
chosen as reaction coordinate are not necessarily correct, 
sometimes resulting in inaccurate barriers and very small 
transmission coefficients. Moreover, the calculation of a 
transmission coefficient suffers from the same quadrati- 
cally dependence of the barrier length. 

We stress that there is a large difference between the 
reactive flux method based on transition state theory and 
the PPTIS technique. Although we use hyper-surfaces 
to divide the phase-space we do not rely on a global 
large transmission coefficient. Instead, we calculate lo- 
cal transmission coefficients and use those as hopping 
probabilities. We believe that the PPTIS method can be 
applied to sample diffusive pathways and calculate rate 
constants in many different complex systems behaving 
more or less diffusive, such as protein folding, nucleation, 
chemical reactions, biochemical networks, and gas diffu- 
sion through membranes. In the near future we intend to 
improve possible sampling problems occurring when the 
used order parameter is a very bad reaction coordinate. 



Using 1— tj-i 



3-2 



ij-i.j, we see that Eq. (|36|l is equiva- 



lent to the second line in Eq. The first line of Eq. (J2J 
is then obtained by the substitution into Eq. I|34|l. 
VI. APPENDIX B 

The criterion of Eq. I|18l) gives for any positive integer 
q > following approximate relations: 



p(l \i \ ~ p(l I 

\m\i±Q ) vml 



pfcZ\Ui)(pT/pf) 



(37) 



With this in mind we can start a derivation similar to 
Appendix [V] 
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and for the reverse direction we can write: 



In this appendix we will derive the recursive relations 
(j2J for the chain of metastable states. For the transfer in 
the positive direction we can write 



= r[i^ - 1 ](i-rb'-i-?]) 

and for the reverse direction 



(34) 



T[J - 1 ->?] = ', i., il J - 2 

= tj-i,j-2 ( T\j - : 



>° l 



t i -i li - 2 (T[7-2-.5_ 1 ] + 
{1-T[j-2^ j _ 1 })T[j-1 



(35) 



Bringing the T[j - 1 -►!?] terms of Eq. to the left 
side gives us: 



T\j - 1 



tj-i,j-2T\j - 2 



>° 1 



l-^-i J - 2 (l-T[j-2^_ 1 ] 



(36) 



= pQV)=pU p W) 

« ^ 1 [p(?_ 1 i^?)+p(r 1 r 1 )p(°i^)] 

= pJ-i[pj--i + (i-pj--i)pQt-l)] _ 

i-l\Pj-l-\ 



pj-l[Pf-l + (l-Pi-l)P{ 



0\j- 
3 \j 



P 



3-1 



Pj-i 



Bringing the P, terms to the left results in: 



(39) 



Pi-iPj-i 



(40) 



With the help of Eq. i|16fl we can see that this is equiva- 
lent to expression (|19fl . Substitution of this relation into 
Eq. l|3^|) results in the expression for P. in Eq. (Tffi . 
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